15 research outputs found
Adversarial Domain Adaptation for Duplicate Question Detection
We address the problem of detecting duplicate questions in forums, which is
an important step towards automating the process of answering new questions. As
finding and annotating such potential duplicates manually is very tedious and
costly, automatic methods based on machine learning are a viable alternative.
However, many forums do not have annotated data, i.e., questions labeled by
experts as duplicates, and thus a promising solution is to use domain
adaptation from another forum that has such annotations. Here we focus on
adversarial domain adaptation, deriving important findings about when it
performs well and what properties of the domains are important in this regard.
Our experiments with StackExchange data show an average improvement of 5.6%
over the best baseline across multiple pairs of domains.

Comment: EMNLP 2018 short paper (camera ready), 8 pages.
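A gradient reversal layer is the standard mechanism behind adversarial domain adaptation: the encoder minimizes the task loss while maximizing the domain discriminator's loss, so the learned features become domain-invariant. The toy one-parameter model below is a hypothetical sketch of that idea under squared-error losses, not the paper's architecture:

```python
# Toy sketch of the gradient-reversal idea behind adversarial domain
# adaptation (hypothetical one-parameter model, not the paper's network).
# An encoder weight w maps input x to a feature f = w * x; a task head
# and a domain discriminator each contribute a squared-error loss.
# The encoder descends the task loss but *ascends* the domain loss --
# exactly what a gradient reversal layer implements during backprop.

def grads(w, x, y_task, y_dom):
    f = w * x
    g_task = 2.0 * (f - y_task) * x  # d/dw (f - y_task)^2
    g_dom = 2.0 * (f - y_dom) * x    # d/dw (f - y_dom)^2
    return g_task, g_dom

def encoder_step(w, x, y_task, y_dom, lr=0.1, lam=0.5):
    g_task, g_dom = grads(w, x, y_task, y_dom)
    # gradient reversal: follow the task gradient, oppose the domain gradient
    return w - lr * (g_task - lam * g_dom)

w_new = encoder_step(w=1.0, x=1.0, y_task=2.0, y_dom=1.0)
```

The coefficient `lam` trades off task accuracy against domain confusion; in full systems it is usually annealed during training.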
Automatic Fact-guided Sentence Modification
Online encyclopedias like Wikipedia contain large amounts of text that need
frequent corrections and updates. The new information may contradict existing
content. In this paper, we focus on rewriting such dynamically
changing articles. This is a challenging constrained generation task, as the
output must be consistent with the new information and fit into the rest of the
existing document. To this end, we propose a two-step solution: (1) We identify
and remove the contradicting components in a target text for a given claim,
using a neutralizing stance model; (2) We expand the remaining text to be
consistent with the given claim, using a novel two-encoder sequence-to-sequence
model with copy attention. Applied to a Wikipedia fact update dataset, our
method successfully generates updated sentences for new claims, achieving the
highest SARI score. Furthermore, we demonstrate that generating synthetic data
through such rewritten sentences can successfully augment the FEVER
fact-checking training dataset, leading to a relative error reduction of 13%.

Comment: AAAI 202
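The two-step solution above can be sketched as a pipeline in which stubs stand in for the learned components: `contradicts` plays the role of the neutralizing stance model, and `fuse` plays the role of the two-encoder sequence-to-sequence realizer with copy attention (the example sentence and claim are made up):

```python
# Hypothetical sketch of the two-step update pipeline. The `contradicts`
# stub stands in for the neutralizing stance model, and `fuse` stands in
# for the two-encoder sequence-to-sequence realizer with copy attention.

def neutralize(sentence_tokens, claim_tokens, contradicts):
    # step 1: drop the components the stance model flags as contradicting
    return [t for t in sentence_tokens if not contradicts(t, claim_tokens)]

def fuse(residual_tokens, claim_tokens):
    # step 2: expand the remaining text to be consistent with the claim
    # (a naive splice here; the real realizer generates fluent text)
    return residual_tokens + claim_tokens

old_sentence = ["the", "population", "is", "5", "million"]
claim = ["7", "million"]
contradicts = lambda tok, _claim: tok in {"5", "million"}  # stub stance model

updated = fuse(neutralize(old_sentence, claim, contradicts), claim)
```

Separating removal from expansion lets each step be trained from a fact-checking dataset even when no direct supervision for the rewrite exists.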
The Limitations of Stylometry for Detecting Machine-Generated Fake News
Recent developments in neural language models (LMs) have raised concerns
about their potential misuse for automatically spreading misinformation. In
light of these concerns, several studies have proposed to detect
machine-generated fake news by capturing their stylistic differences from
human-written text. These approaches, broadly termed stylometry, have found
success in source attribution and misinformation detection in human-written
texts. However, in this work, we show that stylometry is limited against
machine-generated misinformation. While humans speak differently when trying to
deceive, LMs generate stylistically consistent text, regardless of underlying
motive. Thus, though stylometry can successfully prevent impersonation by
identifying text provenance, it fails to distinguish legitimate LM applications
from those that introduce false information. We create two benchmarks
demonstrating the stylistic similarity between malicious and legitimate uses of
LMs, employed in auto-completion and editing-assistance settings. Our findings
highlight the need for non-stylometry approaches in detecting machine-generated
misinformation, and open up the discussion on the desired evaluation
benchmarks.

Comment: Accepted to the Computational Linguistics journal (squib). Previously posted with the title "Are We Safe Yet? The Limitations of Distributional Features for Fake News Detection".
Towards Debiasing Fact Verification Models
Fact verification requires validating a claim in the context of evidence. We
show, however, that in the popular FEVER dataset this might not necessarily be
the case. Claim-only classifiers perform competitively with top evidence-aware
models. In this paper, we investigate the cause of this phenomenon, identifying
strong cues for predicting labels solely based on the claim, without
considering any evidence. We create an evaluation set that avoids those
idiosyncrasies. The performance of FEVER-trained models significantly drops
when evaluated on this test set. Therefore, we introduce a regularization
method which alleviates the effect of bias in the training data, obtaining
improvements on the newly created test set. This work is a step towards a more
sound evaluation of reasoning capabilities in fact verification models.

Comment: EMNLP-IJCNLP 201
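One family of debiasing strategies estimates how strongly surface cues in the claim predict a label and then down-weights the examples dominated by such cues. The snippet below is a simplified, assumed illustration of that idea on unigrams with toy data; the paper's actual regularization method may differ:

```python
from collections import Counter

# Simplified, assumed illustration of one debiasing strategy: estimate how
# strongly each claim unigram predicts a label, then down-weight training
# claims dominated by such cues. This only conveys the idea of reducing
# claim-only bias, not the paper's exact regularization method.

def cue_strength(claims, labels):
    # for each unigram, the fraction of its claims carrying its majority label
    total, pos = Counter(), Counter()
    for claim, label in zip(claims, labels):
        for tok in set(claim.split()):
            total[tok] += 1
            if label == 1:
                pos[tok] += 1
    return {t: max(pos[t], total[t] - pos[t]) / total[t] for t in total}

def example_weight(claim, strength):
    # claims whose strongest cue is highly label-predictive get less weight
    return 1.0 / max(strength.get(t, 0.5) for t in claim.split())

strength = cue_strength(["a b", "a c", "b d"], [1, 0, 1])
```

A token that appears with both labels equally often ("a" above) yields weight 2.0, while a perfectly predictive cue yields weight 1.0, shifting training mass toward examples that require the evidence.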
Multi-source domain adaptation with mixture of experts
Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019. Cataloged from the PDF version of the thesis. Includes bibliographical references (pages 35-37).

We propose a mixture-of-experts approach for unsupervised domain adaptation from multiple sources. The key idea is to explicitly capture the relationship between a target example and different source domains. This relationship, expressed by a point-to-set metric, determines how to combine predictors trained on various domains. The metric is learned in an unsupervised fashion using meta-training. Experimental results on sentiment analysis and part-of-speech tagging demonstrate that our approach consistently outperforms multiple baselines and can robustly handle negative transfer.

by Darsh J. Shah.
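The combination step described above can be sketched with made-up numbers: a point-to-set distance from the target example to each source domain is turned into softmax weights over the source-trained predictors. The nearest-member distance below is the simplest stand-in for the metric the thesis learns via meta-training:

```python
import math

# Sketch of the mixture-of-experts combination (toy data throughout).
# The nearest-member distance is the simplest instance of a point-to-set
# metric; the thesis learns this metric via meta-training instead.

def point_to_set_distance(x, domain_examples):
    return min(abs(x - e) for e in domain_examples)

def moe_predict(x, domains, experts, temperature=1.0):
    dists = [point_to_set_distance(x, d) for d in domains]
    logits = [-d / temperature for d in dists]  # closer domain -> larger weight
    z = sum(math.exp(l) for l in logits)
    weights = [math.exp(l) / z for l in logits]
    prediction = sum(w * f(x) for w, f in zip(weights, experts))
    return prediction, weights

domains = [[0.0, 1.0], [10.0, 11.0]]      # toy source-domain examples
experts = [lambda x: 0.0, lambda x: 1.0]  # toy per-domain predictors
prediction, weights = moe_predict(0.5, domains, experts)
```

Because a distant (mismatched) source receives a near-zero weight, this weighting is also what lets the approach suppress negative transfer.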
Contrastive Text Generation
This thesis focuses on developing summaries that present multiple viewpoints on issues of interest. Such a capacity is important in many areas, like medical studies, where articles may not agree with each other. While the automatic summarization methods developed over the past decade excel in single-document and multi-document scenarios with high content overlap among inputs, there is an increasing need to automate comparative summarization. This is evidenced by the number of services for such reviews in the domains of law and medicine. Building on a traditional generation pipeline of planning and realization, I propose models for three scenarios with contradictions, where the planners identify pertinent pieces of information and consensus to adequately realize relations between them.
First, I tackle contradictions between an old piece of text and a claim for the task of factual updates. As there is no supervision available to solve this task, our planner utilizes a fact-checking dataset to identify disagreeing phrases in an old text with respect to the claim. Subsequently, we use agreeing pairs from the fact-checking dataset to learn a text fusion realizer. Our approach outperforms several baselines on automatically updating text and on a fact-checking augmentation task, demonstrating the importance of a planner-realizer pipeline which can deal with a pair of contrastive inputs.
Second, I describe an approach for multi-document summarization, where input articles have varying degrees of consensus. In a scenario with very few parallel data points, we utilize a planner to identify key content and consensus amongst inputs, and leverage large amounts of free data to train a fluent realizer. Compared to state-of-the-art baselines, our method produces more relevant and consensus-cognisant summaries.
Third, I describe an approach for comparative summarization, where a new research idea is compared and contrasted against related past works. Our planner predicts citation reasons for each input article with respect to the current research to generate a tree of related papers. Utilizing an iterative realizer to produce citation-reason-aware text spans for every branch, our model outperforms several state-of-the-art summarization models in generating related work sections for scholarly papers.

Ph.D.
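The planning step of the third approach can be pictured as grouping papers by a predicted citation reason to form the branches of a tree, which the iterative realizer then verbalizes branch by branch. The predictor below is a hypothetical stub over toy papers:

```python
# Hypothetical stub of the planning step: group related papers by a
# predicted citation reason to form the branches of the tree that the
# iterative realizer then verbalizes, branch by branch.

def build_reason_tree(papers, predict_reason):
    tree = {}
    for paper in papers:
        tree.setdefault(predict_reason(paper), []).append(paper)
    return tree

# stubbed citation-reason predictor for three toy papers
reason_of = {"A": "method", "B": "dataset", "C": "method"}
tree = build_reason_tree(["A", "B", "C"], reason_of.get)
```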
Capturing Greater Context for Question Generation
Automatic question generation can benefit many applications, ranging from dialogue systems to reading comprehension. While questions are often asked with respect to long documents, modeling such long documents poses many challenges. Many existing techniques generate questions by effectively looking at one sentence at a time, leading to questions that are easy and not reflective of the human process of question generation. Our goal is to incorporate interactions across multiple sentences to generate realistic questions for long documents. In order to link a broad document context to the target answer, we represent the relevant context via a multi-stage attention mechanism, which forms the foundation of a sequence-to-sequence model. We outperform state-of-the-art methods on question generation on three question-answering datasets: SQuAD, MS MARCO, and NewsQA.

DSO (Grant DSOCL18002)
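One stage of such an attention mechanism can be sketched with toy 2-d vectors: a query representation (e.g. the answer) scores each candidate vector, and the softmax-weighted sum becomes a context vector. Stacking such stages (answer over sentences, then the resulting context over words) is one assumed way to realize multi-stage attention; the form below is illustrative, not the paper's exact model:

```python
import math

# Assumed form of one attention stage, with toy 2-d vectors. The query
# (e.g. an answer representation) scores each key vector; the softmax-
# weighted sum of the values becomes a context vector. Stacking stages
# (answer -> sentences -> words) yields a multi-stage mechanism.

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attend(query, keys, values):
    weights = softmax([sum(q * k for q, k in zip(query, key)) for key in keys])
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

# stage 1: the answer vector attends over two sentence vectors
context = attend([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
```

The context vector leans toward the sentence aligned with the answer, which is how broad document context gets linked to the target answer before decoding.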
Nutri-bullets: Summarizing Health Studies by Composing Segments
We introduce Nutri-bullets, a multi-document summarization task for health and nutrition. First, we present two datasets of food and health summaries drawn from multiple scientific studies. We then propose a novel extract-compose model to solve the problem in the regime of limited parallel data. We explicitly select key spans from several abstracts using a policy network, then compose the selected spans into a summary via a task-specific language model. Compared to state-of-the-art methods, our approach leads to more faithful, relevant, and diverse summarization -- properties imperative to this application. For instance, on the BreastCancer dataset our approach achieves a more than 50% improvement in relevance and faithfulness.
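The extract-compose idea can be sketched with a greedy coverage heuristic standing in for the policy network: repeatedly pick the span that adds the most uncovered relevant words, then compose the picks. The spans, relevance set, and join-based composer below are all toy stand-ins (the paper composes via a task-specific language model):

```python
# Sketch of the extract-compose idea with a greedy stand-in for the policy
# network: repeatedly pick the span adding the most uncovered relevant
# words, then compose the picks. The join is a stub for the task-specific
# language model that does the composing in the paper.

def select_spans(spans, relevant, k=2):
    chosen, covered = [], set()
    for _ in range(k):
        gain = lambda s: len((set(s.split()) & relevant) - covered)
        best = max(spans, key=gain)
        if gain(best) == 0:
            break
        chosen.append(best)
        covered |= set(best.split()) & relevant
    return chosen

def compose(spans):
    return "; ".join(spans)

spans = ["broccoli lowers risk", "broccoli is green", "fiber aids digestion"]
summary = compose(select_spans(spans, {"broccoli", "risk", "fiber"}))
```

Marginal-gain selection naturally discourages redundant spans, which supports the diversity and faithfulness properties the task demands.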